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Summary 

There are times when knowledge of the covariance matrices of the plant and sensor noises is needed, such 
as in the design of Kalman filter. When this knowledge is not available a priori, estimates of the covariances 
may be useful. Friedland has developed a technique for estimating the unknown variances by using the sample 
correlation of the innovations from multiple observers, but he developed a detailed solution for only a particular 
example. This report extends the work of Friedland by developing a general solution for the estimates of the 
covariance matrices, specializing the solution to the case of diagonal matrices, developing the statistics (mean 
and standard deviation) of these estimates, and demonstrating the solution with two examples. 

For a particular system the absolute accuracy, or standard deviation, of the estimates and the accuracy 
of the estimates relative to each other are functions of the number of observers used and the observer gains. 
This is evident in the examples where very good estimates, that is, estimates with small standard deviations, 
are obtained for some variances while other estimates have a much larger standard deviation. For the two 
examples demonstrated, the estimates are shown numerically to be unbiased. 

Though the solution assumes that the noise variances are constant, the technique can be used to estimate 
variances that are time varying if the change is sufficiently slow. Results are presented for a time-varying 
example in which a step-function increase is introduced into the plant and sensor noise variances. The algorithm 
does estimate the new values for the variances, but there is, of course, a transient period during which the 
estimate approaches the new solutions. The length of this transient period depends on the number of samples 
of the innovations that are averaged to compute the sample correlation. 


Introduction 

There are many times when knowledge of the variances of the plant and sensor noises in a linear system 
is useful information. For example, the determination of the gains of a Kalman filter requires knowledge of 
the covariance matrices of the plant and sensor noises. Inaccuracies in these matrices result in a suboptimal 
filter and thus in suboptimal estimates of the system state. Another instance where knowledge of the variances 
is beneficial is during the testing of gyros, accelerometers, and other sensors (ref. 1). Knowledge of the 
noise variance is used in some parameter estimation problems, such as in the estimation of aircraft stability 
derivatives. As another example, it has been shown (refs. 2 through 4) that the ability to detect and isolate 
control element failures in aircraft systems is adversely affected by wind turbulence, which represents the plant 
noise in this system. Knowledge of the magnitude (variance) of this turbulence could be used potentially in an 
adaptive failure detection system to improve its performance. 

Sometimes knowledge of the noise variances is obtained via direct measurement, such as measurement of 
the noise at a sensor output. When a priori knowledge of the noise intensity is not available, estimates of 
the variances may be useful if the estimates are sufficiently accurate. The ability to obtain the estimates in a 
timely manner is also important in a real-time application, such as in an adaptive system. Friedland (ref. 1) has 
developed a technique for estimating the unknown variances by using the sample statistics of the innovations 
from multiple observers. Other approaches have been investigated by Mehra (refs. 5 and 6), Belanger (ref. 7), 
Godbole (ref. 8), Myers and Tapley (ref. 9), and Dee et al. (ref. 10). 

Friedland presents his technique in reference 1 and demonstrates it by applying it to an example of a laser 
gyro. However, he presents a detailed solution for only this example. The present report extends the work of 
Friedland by developing a general solution for the estimates and by developing expressions for the statistics 
(mean and variance) of these estimates. To illustrate the technique it is applied to Friedland’s example and to 
the example of a damped Schuler loop used by Mehra (ref. 5). 


Symbols 


A 

hypothetical matrix 

A; 

= I <g> I — Aj <g> Aj (see eq. (19)) 

a * 

row i of matrix A 


aij, bij elements of matrices A and B, 

respectively, used in the appendix 

B hypothetical matrix 

B i =Xi® X { (see eq. (20)) 



e*(fr) 

F 

G 

Qij 

H 

I 

i> 

K f 
M 

Mi 

TV 


result of intermediate calculation 
defined by equation (76) 

error in estimate (prediction) of 
x(/c) produced by ith observer 

premultiplying matrix that removes 
redundant elements from S(Y t -) 

postmultiplying matrix defined by 
equation (33) to remove unused 
columns of F(TA“ 1 B Z + U) 

elements of G 

observation matrix 

identity matrix 

indices 

gain matrix for ith observer 

matrix of blocks M t - (see eq. (37)) 

result of intermediate calculation 
defined by equation (34) 

number of samples 


w number of samples averaged to 

compute sample correlation of 
residuals 

w(fc) plant noise vector 

x(fc) state vector 

Xj(/c|fc) estimate of state vector at sample 

k produced by ith observer given k 
observations 

Xj(A: + l|fc) estimate (prediction) of state vector 
at sample k + 1 produced by ith 
observer given k observations 

y (k) vector of observations 

Z result of intermediate calculation 

defined by equation (79) 

r disturbance input matrix 

')i(k) residuals from ith observer 

7 ^(fc) a component of residual from mth 

observer 

A^ n matrix of second moments of 

sample correlations of residuals 
from mth and nth observers 


n number of states 

Pj steady-state covariance matrix of 

estimation (prediction) error for ith 
observer 

P t -(jfc) covariance matrix of estimation 

(prediction) error for ith observer 

P mn steady-state covariance matrix of 

estimation (prediction) errors e m (k) 
and e n (k) of mth and nth observers 

P mn(k) covariance matrix of estimation 

(prediction) errors e m (k) and e n (k) 
of mth and nth observers 

p number of observers 

Q covariance matrix of plant noise 

q dimension of plant noise vector 

R covariance matrix of sensor noise 

r dimension of sensor noise vector 

T = H®H 

U premultiplying matrix defined by 

equation (24) 

elements of U 

V 

v(k) sensor noise vector 


8(j, k) Kronecker delta function 

A i = fc(I - K;H) (see eq. (9)) 

\j eigenvalue of A t - 

B covariance matrix of combined plant 

and sensor noise vector £( k ) 

£ vector of plant and sensor noise 

variances, that is, of diagonal 
elements of Q and R (see eq. (35)) 

£(k) combined plant and sensor noise 

vector (see eq. (11)) 

f . elements of £ 

covariance matrix of estimate J of 
plant and sensor noise variances 

(jg h element of matrix of second mo- 

ments of sample correlations of 
residuals (see eq. (54)) 

T stacked elements of sample corre- 

lation of residuals from p observers 
with redundant elements removed 

Yj correlation of residuals of ith 

observer (see eq. (15)) 

Tj sample correlation of residuals of 

ith observer (see eq. (27)) 

$ state transition matrix 
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Xi = [T,-*K i ](seeeq.(10)) 

Subscripts: 

t, m, n ith, mth, and nth observer, 

respectively 

Superscripts: 


T 


matrix transpose 


# generalized inverse of the matrix 

(see eq. (40)) 

estimated value 


Operators: 

E{ } expected value 

S( ) stacking operator defined by equa- 

tion (17) 


a, /?, 6, 6 , /i, v denotes component of vector; or if 
in pairs, denotes element of matrix 


Kronecker product defined by 
equation (16) 


Theoretical Development 
The System 

The first step in the development of a solution for the estimates of the variances of the plant and sensor 
noises is to describe the system and to formulate the problem in a mathematical framework. To this end 
consider a discrete, linear, time-invariant system described by the state space equations 


x(fc + 1) = x(fc) + T w(fc) 
y{k) = H x(fc) + v(fc) 


( 1 ) 

( 2 ) 


where x(fc) is an n dimensional state vector, y(fc) is an r dimensional vector of observations, and the vector 
noise sequences w (k) and v(k) are white and Gaussian with a mean of zero. The vectors w (k) and v(k) have 
the dimensions q and r, respectively. Assume that 


E{w(fc)} = E{v(ifc)} = 0 

E{w (j) w T (fc)} = Q S{j,k) 
E{v(j) v T (k)} = R6{j, k ) 
E{v(j) w T (k)} = 0 


( 3 ) 


where the covariance matrices Q and R are constant but unknown, E{ } denotes the expected value, and 
6 (j, k ) is the Kronecker delta function. The matrix O is the state transition matrix, T is the disturbance input 
matrix, and H is the observation matrix. The matrices T, and H are assumed to be constant and known. 

The problem is to estimate the unknown noise covariance matrices Q and R. The solution to this problem 
will be developed in the following steps: (1) define a set of observers for the system, (2) develop a (linear) 
relationship between the covariance of the observer estimation error and the covariances of the plant and 
measurement noises, (3) develop a (linear) relationship between the correlation of the observer residuals and 
the covariance of the observer estimation error, (4) from the results of steps (2) and (3) develop an expression 
for the correlation of the observer residuals in terms of the covariances of the plant and measurement noises, 
(5) develop an expression for the sample correlation of the observer residuals, (6) use the sample correlation 
of the residuals as an estimate of the true residual correlation to calculate an estimate of the plant and 
measurement noise covariances utilizing the solution found in step (4), and (7) develop expressions for the 
mean and the standard deviation of the estimates of the covariances. 

The Observers 

For the system described by equations (1) and (2), consider the ith observer described by 

Sii(k + l|fc) = $ 5ii(k\k) (4) 
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Xj(fc|/c) = Xj(fc|& - 1) + Kj ')i(k) (5) 

where Kj is the gain matrix for the tth observer, and ')i(k), the vector sequence of observer residuals, is given 
as 

li{k) = y(k) — H 5ii(k\k - 1) (6) 

Define the observer error as the difference between the actual state of the plant and the observer’s estimate 
(prediction) of the state as follows: 

e i(k) = x(fc) — Xj(fc|fc — 1) (7) 

From equations (1), (2), and (4) through (7), 

e^(fc + 1) = x(fc + 1) — Xj(& + l|fc) 

= $ x(fc) + T w(i fc) - $ { 5ci{k\k - 1) + KJH x(fc) 

+ v(fc) - H 5c£(fc|A: — 1) ] } 

- *(I - KjH) ej(fc) + T w (k) - &Ki v{k) 

= A i ei{k) + Xi ((*) (8) 

where a closed-loop observer matrix A;, a new disturbance input matrix and a combined plant/sensor noise 
vector £(&) are defined by 


A. t = <&(I KjH) 


X i = F,-*K i ] 


f(fc) 


w (k) 
v(fc) 


The covariance Pj(fc) of the estimation (prediction) error is 


P iW = E {ej(A:) ef (A:)} 


( 9 ) 

( 10 ) 

( 11 ) 


( 12 ) 


If Kj is chosen such that Aj is asymptotically stable (that is, eigenvalues A j of A j such that |Aj| < 1), then 
the steady-state covariance Pj is the solution of 


Pj - AjPjAf = XjSxf 

where 

E = E{t(k) ^(A:)} 


Q 

' 

0 

-° 

rJ 


(13) 


(14) 


Such stabilizing gains Kj can be found (e.g., by using the Kalman filter algorithm) if the system (eqs. (1) and 
(2)) is detectable. Equation (13) is a linear relationship between the covariance of the estimation error and 
the covariances of the plant and sensor noises. Thus, step (2) is complete. 
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Correlation of the Residuals 

The correlation of the residuals of the fth observer is given by 


T i = E{n(k)'i?(k)} 

= E {[y (k) - H - 1)] (y(fc) - H x,-(*:|A: - 1)] T } 

= e|[H e t -(fc) + v(fc)] [H ei(k) + v(fc)] T j 

= HPjH T + R (15) 


Equation (15) relates the correlation Tj of the residuals to the observer estimation error Pj as desired in 
step (3). The residual correlation Y t now can be linearly related to the plant and sensor noise covariance S 
by eliminating P; from equations (13) and (15) as follows: 

As shown by the example in the appendix, the Kronecker product (ref. 11, p. 22) can be defined as 


A ® B = 


' anB ai2B • ■ • aj n B ' 
021 B <2220 • • ■ U2 nB 

-<2mlB a m 2 B ®mn B. 


(16) 


Define the stacking operator S( ) such that if a 4 - are the rows of the m x n matrix A, then S(A) is an mn x 1 
column vector as follows (see the appendix for an example): 


S(A) = 


(17) 


From using this notation, the solution of equation (13) for Pj in terms of S is given by (ref. 12, p. 89) 

[I ® I-A< <8>A t ] S(P i ) = S( X .Sxf) 

= [Xi ® xj S(B) (18) 


Let 

A t - = I ® I — Aj ® A i (19) 

Bj = Xi ® Xi (20) 

Then, 

A, ; S(P i) = Bi S(S) 

S(P<) = Ar ^ i S(B) (21) 
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A solution to equation (18) is assured if the eigenvalues of A* have a magnitude less than unity (ref. 13, p. 181, 
prob. 2.6-2). Rewriting equation (15) gives 


S(Tj) = S(HP;H r ) + S(R) 

= (H ® H) S(Pi) + S(R) 
= T S(Pj) + S(R) 


where 

But S(R) can be written as 
where 

U ij ~ ^ 


T = H ® H 

S(R) = U S(B) 

/ 1 < l < r; > 

(/ — l)r 1 < i < l r; 

V j = i + q(q + r) + lq J 


( 22 ) 


(23) 


(24) 


Uij = 0 (elsewhere) 

Combining equations (21), (22), and (23) gives 

S(Tj) = TA^B i S(B) + U S(B) 

= (TA^B,- + U) S(B) (25) 

But the matrix T,- is symmetric; thus, elements Y t - (m, n) are redundant for m > n. To reduce the size of the 
arrays and to avoid potential problems with taking the matrix inverse, remove these elements from S(T*) and 
remove the corresponding rows of TA“ 1 Bj + U as follows: Define a matrix F of dimension |\r 2 + r)/2j x r 2 

as the r 2 x r 2 identity matrix with the following rows removed: r + 1; 2r + 1, 2r 4- 2; 3r + 1, 3r -I- 2, 3r + 3; . . .; 
(r — l)r + 1, . . ., r 2 — 1. Equation (25) now becomes 

FS(Yj) = F(TA7 1 B i + U) S(B) (26) 

We now have a linear relationship (eq. (26)) between the correlation Tj of the residuals and the covariance B 
of the noises, thus completing step (4). 


Sample Correlation 

Let Y i(k) be the sample correlation of the residuals ^ computed from 

Yj(fc) = 1^2 'll (»] 

3 = 1 

= ^ %{k - 1) + l [7iW if (*)] (27) 

In the software implementation of equation (27), only elements Y,(m, n) for m > n are computed since the 
matrix Y (A;) is symmetric. 

The sample correlation Y t -(& ) is an estimate of Y 2 - as required in step (5), and S(Y 2 -(fc)) could be used in 
equation (26) to solve for B, an estimate of B. However, S(B) is (q + r) 2 x 1, whereas F S(Y*(fc)) is only 
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[(r 2 + r)f 2] x 1; thus, there are (r 2 + r)/2 equations in ( q + r) 2 unknowns. Actually, 2qr elements of B are 
always zero, so there are only q 2 + r 2 unknowns. 

This difficulty can be overcome by using multiple observers, with a different gain K t for each observer and 
thus for different residuals 'u(k). Then, 


's (fi(k)y 


TA^Br + U' 

S(f 2(k)) 

= 

TAj X B2 + U 

.S(T P (A:)). 


.TA-iBp + U. 


S(S) 


(28) 


The problem here is that a necessary condition for a unique solution of equation (28) is that 

q 2 - t-r 2 
~ (r 2 +r )/ 2 


(29) 


Thus, p can be quite large, and the number of observers required can be larger than practical. As an illustration, 
for Mehra’s example (ref. 5) which will be discussed later, q = 3, r = 2, and thus p > 5. By taking advantage 
of the fact that Q and R are symmetric, the number of unknowns can be further reduced to (q 2 + q + r 2 + r)/ 2, 
but this matter will not be pursued here. 


Diagonal Covariances 

Although equation (28) can be solved for the estimate B of the covariances, the number of unknowns in B 
(and thus in B) can be reduced considerably if the problem is restricted by making the frequent assumption 
that the covariances Q and R are diagonal; that is, the components of w (k) are uncorrelated with each other, 
and similarly for v{k). Then, 3 is also diagonal, the number of unknown variances is reduced to q + r, and the 
number p of observers required for a solution may be reduced also. In this case, 


Q 1 


E{w(k) w T {j)} = 


<72 

0 


0 


<5(i, k ) 


Qq J 


(30) 


such that B becomes 


r\ 


E{v(fc) v r (y)} = 


r 2 


0 


0 


<H;> k) 


r T J 



_ 


rfi , 

■ 

Q 

0 


i 2 

0 

0 

R J 


0 

£ q+r - 




. 


(31) 


(32) 
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Use of equation (32) in equation (26) shows that only q + r columns of F(TA ?; 2 Bj + U) are multiplying 
nonzero elements of S(B). To remove the assumed zero elements from S(S) (off-diagonal elements of S) and 
to eliminate the corresponding unused (q + r) 2 - (q + r) columns of F(TA“ 1 B i + U), define a new matrix G 
with dimensions ( q + r) 2 x {q + r) and elements g^j such that for 1 < j < q + r and 1 < i < (q + r) 2 , 

f 1 (for 1 <j<q + r;i= (j - 1 )(q + r) + j) 

9ij = i (33) 

( 0 (elsewhere) 

See the appendix for an illustration of G. Define a new matrix M 2 by 

M, = F(TA J ” 1 Bj + U)G (34) 

(Eliminate all columns except columns j, j — 1, q + r + 2, 2 (q + r) + 3, S(q + r) + 4, • • •, (q + r - 1) 
(q + r) + q + r = (q + r) 2 .) Similarly, define a new vector of unknown variances £ as 


r |i 1 


‘91' 

I2 


<72 

i, 

= 




r 1 

-tq+r - 


- T r - 


G t S(5) 


(35) 


Corresponding to equation (26), the relationship between the correlation of the residuals and the covariances 
of the noises (step (4)) for the case of diagonal covariance matrices becomes 


F S(Ti) = (36) 

Equation (36) is a set of (r 2 + r)/2 equations in q + r unknowns. 

If (r 2 + r)( 2 < q + r, thenp observers can be used, as in equation (28), such that p(r 2 + r)/2 > q + r. 
Using the sample covariances T t -(fc) as estimates of the residual covariances T 2 (fc), equation (37) can be used 
to compute an estimate £ of the plant and sensor variance vector £. Thus, 



FS (TjW) 
F S(f 2 (*)) 


■ Mi ' 

m 2 

FS(T p (fc)). 


- M p . 


1 = M| 


If rank (M) = p(r 2 + r)/2 and p(r 2 + r)/2 = q + r, then the solution is 


£ = M _1 T 


(37) 


(38) 


Otherwise, a generalized inverse M# must be used such that 

t = M*f 
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(39) 



If M is full rank, that is, rank (M)= q + r, a least-squares solution for £ is given by equation (39) when M# 
is given by 

M* = (M r M) -1 M T (40) 

Thus, we have developed a general solution for the estimates £ of the unknown elements of a diagonal noise 
matrix S in terms of the sample correlation Y of the residuals of the observers, thus completing step (6). 

Statistics of the estimated covariance. We next develop an expression for the mean and the standard deviation 
of the estimate of the plant and sensor noise variances for use in evaluating the accuracy of the estimation 
process. Before doing that, we need to develop some preliminary relationships. Mehra (ref. 5) has shown that 
the innovation sequence ~j(k) from a suboptimal filter is a Gaussian process. This sequence has a mean of zero 
in the steady state. Consider the following: 


Eh(fc)} = E{y(fc)-Hx t (fc|fc-l)} 

= E{H x(k) + v(k) - H Xi(k\k - 1)} 

= H E{e;(A:|fc — 1)} + E{v(fc)} (41) 

In equations (3) it was assumed that E{v(fc)} = 0. The error ej(k) in the estimate x^(kfk — 1) of the state was 
given by equation (8) to be 

e f (fc) = $(I - KjH) e { (k - 1) + fw (fc - 1) - v(k - 1) (42) 


Equation (42) has a solution 


k 

e t (k) = [*(I - K*H)] fc e,-(0) - £[*(I - K t -H)] J ' _1 ®K t - v(k - j) 

3 = 1 
k 

+ ^[4&(I-K i H)r 1 r w(fc-j) 

3 = 1 


(43) 


such that 

E{e i (fc)} = [^(I-K i H)] fc E{e,(0)} 
k 

- £>(I - KiKW-'QKi E {v(k-j)} 
3 = 1 
k 

+ ^[®(I-K t -H)H“ 1 r E{w(k-j)} 

3 = 1 


(44) 


From equations (3), E{v(/c — j)} = 0 and E{w(/c — j)} = 0. Furthermore, if the filter is asymptotically stable, 
then in the steady state [<&(I-KjH)] fc decays to zero, as k approaches infinity; thus, E{ej(/c)} = 0. Therefore, 
the innovations 'i l {k) have a mean of zero in the steady state. 

Before proceeding with the development of the statistics of the estimates of the noise variances, consider the 
second moment of the errors e m (k + 1) and e n (k + 1) in the prediction estimates x. m (k + l|fc) and x n (& + l|fc) 
of the state produced by the mth and nth filters, respectively. Thus, 


p mn{k + 1) = E{e w (fc + 1) (fc -I- 1)} 

= $(I - K m H) E{e w (fc) e£(fc)}(I - K n H) T ^ T 
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+ r E{w(fc) w T (fc)}r T + $K m E{v(A:) v T (/s)}K£* r 
= <&(I - K m H) R mn (k) (I - K„H) t $ t + TQr T 

+ <&K w RK£<& t (45) 

In the steady state, equation (45) for the second moment of the prediction error becomes the discrete Lyapunov 
equation, which can be solved to obtain the error covariance P mn . Thus, 

P mn = <&(I - K m H)P mn (I - K n H) r * r + TQr T + <&K m RKn<& T (46) 


Mean of the estimates. Now let us proceed with the task of determining the mean and the standard deviation 
of the estimates of the variances of the plant and sensor noises. The mean of the vector £ of estimated variances 
is given by 


E{£} = M* E{f } 


'FS(Ti)' 


= M# E 


ILFS(T P ). 
'F S(E{Ti>r 


= M # 


|_F S(E{T P }) 


(47) 


where is the sample correlation of the residuals of the fth filter defined by equation (27). The expected 
value of Yj can be found from 

, w 

= - E E {[y(0 - H - !)] [y(0 - H ZiW - 1 )] T } 

1=1 
, w 

= - E E {[H e,-(0 + v(/)] [H e,-(i) + v(/)] T } 

1=1 

1 W 

E{T;} = - E [ H p ii(0 hT + R ] (48) 

l=l 

In the steady state this becomes 

E{Tj} = HPjjH r + R 
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(49) 



Therefore, the expected value of the estimate of the variances is 


E{?.} = M* 


F S (HP n H r + R) 

[fs(hp pp h t + r)_ 


( 50 ) 


Equation (50) is an expression for the mean of the estimated variance, the first statistic to be developed in 
step (7). 


Mean square of the estimates . The mean square, or second moment, of the estimates of the variances of the 
plant and sensor noises can be found similarly, but the computations are somewhat more tedious. Thus, 


E{|| T } = M* E{f f T } (m#) T 


(51) 


The second moment of the sample correlation of the residuals can be found as follows: 

'FS^) 


E{T f } = E 


1 [fs(t p )_ 


[fs(t,) ... FS(t„)] 


where 


A^ n = E{Fs(f m ) S r (f n ) F r } 

sr (if ■*(*,*«) ff} 

= [°th] 


(52) 


(53) 


In terms of notation, A^ n is the block in row m and column n of blocks when the matrix E {f f T } is 

partitioned with regard to observers. The variable a\ h is the element in row g and column h of E (f T H and 
is expressed by I J 


2 U) VJ 

a th = -£ E E {^m(0 7m(0 l 6 n(k) 7^*)} 

1=1 k=l 1 


(54) 
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where „ _ i m(r 2 + r) 

g = — — ( 2 r + 2 — a) + P + l — a + 

(1 < a < r; a < j3 < r) 

fl -1 n (r 2 + r) 

h - - 5 - (2r + 2- 6) + 6 + 1-6+ — i — L 

z z 

(1 <6<r- 6 < 8 < r) 

and denotes the a component of the vector 7 m (/). 

Since the random variables 7 ^ (j ) are Gaussian with a mean of zero, the expected value of the product of 
four variables as in equation (54) is given by (ref. 14, p. 274, prob. 8-14) 

, W W 

<4. = 3EE[ E {«') 1&«)} e{tJ( k) +k)} 

i=lfc=l 

+ E { 7 m(0 7n(^)} E { 7 m(0 7n(^)} 

+ e{7“(/) t£(*)} E{7&(Z) 7^)}] (56) 

To this point we have found the mean square of the estimate £ of the variances of the plant and sensor 
noises in terms of the second moments of the sample correlations T of the filter residuals (eq. (51)). For each 
element o* h of the second-moment matrix of the sample correlations of the residuals, we found an expression 
in terms of the second moments of the residuals 7 ^ (eq. (56)). 

Let us turn now to consider the general term E {7m(0 7 %{k)} of the matrix of second moments of the 
residuals. First, consider 

E { 7 m(l) In (k ) } = E {[H e m (l) + v(/)] [H e n (k) + v(fc)] T } (57) 

Equation (57) becomes 

E {7 mil) 7«W} = HE {e m (0 el (k) }h t + E { v(Z) e£ (k ) } H T 

+ H E {e w (/) v T (k ) } + E {v(Z) v T (k ) } (58) 

From equations (3), 

E jv(/) v r (£)} = R 6(1, k) 

We will examine the remaining terms of equation (58) for three different conditions: l < k, l = k, and l > k. 
First, assume that l > k. Note that the state estimation error can be written as 

l—k 

e m (l) = [* (I - K m H)]'- fc e m (k) — £ [* (I - K m H)] i-1 <&K m v(Z - i) 

i = 1 

l—k 

+ £[$(I-K m Hf 1 rw(l-i) (59) 

i—1 
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Substituting equation (59) into (58) gives 


E{e m (Z) e£(fc)}= [^(I-K m H)]'- fc E{e m (fc) e£(fc)} 
l-k 

- £ [* ( r - KmHJF' 1 1>K m E jv(Z - t) e£(fc) } 
i = 1 

l—k 

+ £ [* ( r - KmH)] 1 '- 1 r E { w(Z - *) e£ (A) } (60) 

2=1 

But from equations (45) and (46) we get 

E{em(fc) e n(&)} = Pmn(fc) (01) 

and since the prediction estimation error does not depend on current or future noise samples, the result is 


E {v(/ — *) e^(/c)} = 0 ) 

! ; \ ( i~i>k ) 

E{w(Z-») e£(fc)J = 0 J 


(62) 

Therefore, the first term on the right side of equation (58) becomes 



H E {e m (0 e£(fc) } H r = H [« (I - K m H)]'~ fc P mn (fc) H r 

(Z > k) 

(63) 

the second term of equation (58) becomes 



e{v(Z) e£(A;)}H T =0 (Z > k) 


(64) 

and the third term of equation (58) becomes 



HE {e m (Z) v r (fc)} = H [« (I - KmH)]'-* E {e m (A;) v r (fc)} 



l-k 

-H^[$(I- K m H)] i_1 <&K m E |v(Z 
i— 1 

-*) v r (fc)} 


l-k 

+ H £ [* (I — K^H)]^ 1 r E |w(Z - t) 

7=1 

' v T (*)} 


= -H [S> (I - Kmil)} l ~ k - 1 *K m R 

(/>*) 

(65) 

Now assume that l < k. As in the development of equations (60) through (65), it can be shown that 


E {e w (Z) e^(fc)} = P wn (Z) [(I - K n H) r * T ] (Z < k) 

(66) 

E{v(Z)eJ(Z:)} = -RKy[(I-K n H) T $ T ] fc ' * 
Now assume that Z = k. In this case, 

(Z < k) 

(67) 

E |e w (^) e n (^) } = P mn{k) 


(68) 
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(69) 


E {v(fc) e£(fc)} =0 

E {e m (fc) v T (fc) j = 0 (70) 

E{v(fc) v T (fc)} =R (71) 

Combining equations (58) and (63) through (71) gives the correlation matrix for the filter residuals as 

[ H[#(I-K m H)]'-* P mn (k) H t 
- H {* (I - K w H)] i - fc - 1 $K m R 
H P mn (fc) H t + R 

HP mn(l) [(I-K n H) r $ T ]^H T 

i k~l— 1 


v{'1m(l)'lZ(k)} = 


(i l > k) 
(l = k) 


(72) 


[ - RK£$ t [(I - K n H) r $ T ] H r (l < k) 


Now, we restate the general term a gh (given previously in eq. (56)) in the matrix of second moments of the 
sample correlations of the residuals: 

* W W 

= -2 E E [e{t&(0 t&(o} e {^(fc) i s n ( k )\ 

i=i k=i 

+ E{ 7 “(0 7^(fc)} E {^(l) li(k)} 


+ 


E{^(l)7tW} E {^(l) riik)}] 


Each term E 7n(^)} ' s given by the entry in the /ith row and the i/th column of the residual correlation 

matrix in equations (72). The general term then becomes in the steady state 


a A gh = [HP mn H r + R ] a ' [HP mn H T + r] 


1 86 


+ i | [HP mn H T + R] a6 [HP mn H r + R ] 08 
+ [HP mn H T + R\ aS [HP mn H r + R]^} 

1=1 k=l+ 1 1 L J L 

Cl (xl) k ~ , ~‘ H^]“‘ [cj (x T n ) k ~‘~' H T ] 136 1 


06 


w l—l 


+ ^ £ t { »]“' [«x^'- 1 C m ] 

1=2 k=l 1 


06 


where 


X =<&(!- K m H) 


(73) 


(74) 
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(75) 

(76) 


xJ = (I-K n H) r * r 

Cm = $ [(I - KmH) P mn H T - K m R] 

Cl = [HPmn (I - KnH) T - RK^] <& T (77) 

and [Z] M " denotes the entry from row fi and column v of matrix Z. In equation (73) the steady-state form P mn 
of the error covariance was used because the sample correlation (eq. (27)) is computed in the usual application 
of this algorithm over a window of data of length w which begins after the filter has reached steady state. The 
double summations in equation (73) can be simplified to single summations as follows: 

1—2 fc=l 
1 w— 1 

= i E(“-J){l z l“'[ z l'’ , + [ z l“‘[ z l'”} ( 7 «) 

J = 1 

where 

Z = H X J m - 1 C m (79) 

Equations (46), (50) through (53), (56J, and (72) through (78) can be used to calculate the mean vector 
and second-moment matrix of the vector £ of estimates of the plant and sensor noise variances and R jj, 
respectively, assuming that Q and R are diagonal. The covariance matrix of the estimate £ is easily found 
from 

e k = e{[?-e{J}][2-e{J}] t } 

= E {f! T }~ E {I} e{2} t 

= M # E {rf T } - (m # E {t}) (m # E {X}) T (80) 

The standard deviations of the estimates of the noise variances are just the square roots of the diagonal terms 
of £££. Step (7) of the development of the general solution is now complete. 

Examples 

To demonstrate the use and performance of the technique for estimating noise covariances, the procedure 
was applied to two examples: one example was a ring laser gyro used by Friedland (ref. 1), and the second 
example was a damped Schuler loop used by Mehra (ref. 5). For both examples the covariance matrices of the 
plant and sensor noises were diagonal. For each example, equations (46) and (50) were used to calculate the 
theoretical mean E {^ j of the estimate of the variance £. Equations (51) through (53), (73), and (78) were 

used to calculate the second moment E {11} of the estimate, and equation (80) was used to calculate the 
variance and hence the standard deviation of the estimate. These statistics were computed as functions 
of w, the number of samples averaged to obtain the sample correlation of the residuals. 

In addition, each example system was implemented in digital simulation together with the variance 
estimation algorithm (eq. (39)). The sample statistics (mean and standard deviation) of the estimates were 
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computed using a sample of 100 runs of the simulation, each with a different seed for the random number 
generator. In the simulation the sample correlation of the residuals (eq. (27)) was calculated using a first-order 
recursive low-pass filter of the form 

fi(k + 1) = [1 - (1 /w)] Ti{k) + (1 /w) 'u(k) 7 f (k) (81) 

This approach used less computer memory than storing a window of residuals (window length denoted by w). 
Values of w of 200, 400, 800, 1600, and 4000 samples were used. 


Friedland’s Example 


The laser gyro example (ref. 1) was a two-state system with angular position and angular rate as the states. 
The discrete-time version of the system for a sample time of 15 sec resulted in a state transition matrix as 
follows: 

f 1 15 1 


$ = 


0 


1 


(82) 


The observed quantity was the angular position, and thus the observation matrix was given by 


H = [ 1 0] 


(83) 


The noise input matrix was assumed to be 



0 

1 


(84) 


The problem was to solve for estimates £ of the unknown diagonal elements of the plant and measurement 
noise covariance matrices Q and R. However, to implement the system in simulation, the following true values 
for the noise covariances Q and R were assumed: 


Q = 


1 o 

0 0.25 x 10~ 4 


(85) 


R = 0.25 

For this example, the dimensions were r — 1 and q = 2, and 

p{r 2 + r )/ 2 > q + r 


(86) 


(87) 


or 

p > 3 

Therefore, three constant gain observers were employed with Friedland’s values (ref. 1) for the observer gains. 
These gains were obtained by Friedland using the Kalman filter algorithm and assuming a different set of values 
for the noise covariances for each observer. 

The theoretical means of the estimates are plotted in figure 1(a) together with the sample means. As 
expected for an unbiased estimator, the theoretical means equal the true values, and the sample means show 
good agreement with the theoretical and true values. The theoretical and sample standard deviations are 
plotted in figure 1(b). The theoretical and sample values agree fairly well for Q22 and Rq. For Qn the 
sample standard deviations are slightly larger than the theoretical values but follow the same general trend. 
The theoretical standard deviations are replotted in figure 1(c) in terms of percent of the true value. As in 
figure 1(b) note that the standard deviations are approximately proportional to the inverse of the square root 
of the number of samples averaged to obtain the sample correlation of the residuals. Note also that with these 
gains the algorithm is least accurate in estimating Q22 and most accurate in estimating Qn, which can be 
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estimated with a standard deviation of less than 20 percent of its true value when 300 or more samples are 
averaged. 

In four simulation runs Friedland obtained estimates (with w — 100) ranging from 0.88 to 1.27 for Qn, 
from 0.142 x 10 -4 to 0.35 x 10~ 4 for Q 22 , and from 0.17 to 0.36 for Rn (ref. 1). These estimates all fall within 
one theoretical standard deviation of the true values. 

Mehra’s Example 

Mehra’s example had two states for the loop dynamics and three augmented states to account for the 
correlated noise resulting in a combined five-state system. The system matrices were as follows: 


$ = 


f0.75 -1.74 -0.3 


0.09 0.91 

0 0 

0 0 

. 0 0 


H = 


r = 


1 
0 
0 
0 

24.64 

0 

L 0 


-0.0015 

0.95 

0 

0 

0 0 0 
1 0 1 
0 
0 
0 

0.835 

0 


0 

0 

0 

0.55 

0 

r 

0 

0 ' 
0 
0 
0 

1.83. 


-0.15 ' 
-0.008 
0 
0 

0.905 . 


For purposes of simulation, the assumed values for the noise covariance matrices were as follows: 


( 88 ) 


(89) 


(90) 



-1 

0 

0- 

Q = 

0 

1 

0 


.0 

0 

1. 


R = 


1 0 
0 1 


For this example, r — 2 and q = 3, and 


p (r 2 + rj /2 


>q + r 


(91) 


(92) 


(93) 


or 

p > 2 

Both two-observer and three-observer estimators were evaluated. The following observer gains were used: 


0.953 

0.772 - 

0.28 E— 2 

0.338 

-3.5 

-1.49 

-0.176 E— 3 

0.35 

0.0319 

-0.77. 


(94) 
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k 2 = 


K 3 = 


0.99 

1.3 ' 

0.5 E— 2 

0.26 

-2.9 

-0.77 

-0.3 E— 3 

0.21 

0.04 

-0.816. 

1.03 

0.5 - 

0.5 E— 2 

0.18 

-1 

-1.25 

-0.5 E— 3 

0.22 

0.05 

-0.54. 


(95) 


(96) 


These gains were not chosen through a systematic process to optimize the estimator performance. However, 
they were selected through trial and error to be at least in the vicinity of a local minimum for the theoretical 
standard deviation of the estimates. Gains Ki and K 2 were used for the two-observer case. 

The theoretical means of the estimates for the three-observer case are plotted in figure 2(a) together with 
the sample means, and as expected the theoretical means equal the true values. Again, the sample means 
show good agreement with the theoretical and true values. The theoretical and sample standard deviations 
are plotted in figure 2(b) and are in good agreement. The algorithm does an excellent job of estimating Qn, 
but not as good of estimating Rn- The relative accuracies of the estimates can be changed by adjusting the 
observer gains. In the two-observer case with these gain matrices, the standard deviation of the estimates of 
Qn and Q 22 were not greatly different from the three-observer case, but the estimates of Q33 and Rn were 
much worse with standard deviations that were larger by a factor of 7 to 8. 

Figure 3 shows time history plots of the estimated values of £ of the noise variances for w = 200 (fig. 3(a)) 
and w = 4000 (fig. 3(b)) for the three-observer estimator. The estimates with w = 4000 are, of course, much 
smoother. Note in figure 3(b) the extremely smooth curve for = Q(l, 1) as predicted by the standard 
deviation of approximately 0.04 for this estimate. On the other hand, the estimate £ 4 = R(l, 1) is much 
noisier commensurate with an expected standard deviation of approximately 0.7. Note that the estimate 
R(l, 1) appears to take on negative values at times. (The time history plots of R(l, 1) in fig. 3 are limited 
between 0 and 2.) These negative values are caused by the values of the sample correlation T of the residuals 
not being accurate estimates of the true correlation T, particularly with w = 200. 

Although the algorithm was derived on the assumption of constant plant and sensor noise variances, the 
technique can be used to estimate slowly varying noise variances, that is, slowly varying in comparison to the 
time constant of the averaging filter, or the number of samples averaged to compute the sample correlation of 
the residuals. An idea of the transient response of the algorithm can be obtained from figure 4, which contains 
time history plots for the case where the original noise variances are increased in a step-function manner by a 
factor of 3 at sample number 5000. Following the increase the estimates are seen to approach the true values 
slowly, except for = Q(3, 3). Even though Q(3, 3) does not approach its true value during the period plotted, 
its value after the step increase is larger than in figure 3, where the true variances were held constant. Of 
course, the transient response would have been faster for smaller values of w, even though the estimate would 
have been “noisier” . 

This ability to estimate slowly changing variances is important for practical applications, where the plant 
and measurement noise variances may not be strictly constant. This capability may be most important in 
real-time applications in which time does not permit the data to be examined and possibly broken down into 
segments of nearly constant noise variances for subsequent processing. 


Concluding Remarks 

A general solution has been developed for extending the technique proposed by Friedland for estimating the 
plant and sensor noise covariance matrices of a discrete, time-invariant linear system using multiple observers. 
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Furthermore, solutions were developed for the mean and the covariance of the estimates. The general solution 
was specialized to the case in which the noise covariance matrices are diagonal, and this algorithm was 
demonstrated on two examples. 

In the examples the accuracy of the estimates as measured by their theoretical standard deviations, as well 
as by their sample means and sample standard deviations, was very good for some of the components of the 
covariance, but not so good for others. The accuracy depends on the number of observers used in the estimator, 
the number of terms averaged in computing the sample correlation of the observer residuals, the observer gains, 
and the relationship between the observations and the system states. The accuracy of the estimator can and 
should be investigated for each application using the expression for the theoretical variance of the estimate. 
Optimization of the gains to minimize the error in the estimates was not investigated. It should be noted that 
for large systems the technique requires considerable computer memory, since some of the matrices get quite 
large. 

The development of the solution assumed that the noise variances were constant. However, one of the 
examples demonstrated that the technique can be used to estimate slowly changing noise variances if the 
change is sufficiently slow compared with the time over which the products of the residuals are averaged to 
compute the sample correlation. 

An area for future research is the investigation of a procedure for choosing the observer gains to optimize 
the accuracy of the estimator. Also, further work to compare this technique with others in terms of accuracy 
of the estimates and practicality of implementation of the algorithms would be beneficial. 

NASA Langley Research Center 
Hampton, VA 23665-5225 
November 10, 1987 
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Appendix 

Some Matrix Operations 

Kronecker Product 

Consider an m x n matrix A and an arbitrary matrix B, as shown in equation (16). Define the Kronecker 
product A ® B (ref. 11, p. 22) as 


" anB ai 2 B 
a 21 B a 22 B 


A <g> B = 


a lnB 

fl 2 «B 


La m iB a m 2B ••• flmnB- I 


For example, suppose A is 2 x 3 and B is 2 x 2. Then, 


'an&n 

an6i2 

a 12 &ii 

012&12 

013^11 

013^12 " 

011&21 

011^22 

012^21 

012^22 

013^21 

013&22 

021^11 

021^12 

022^11 

022^12 

023^11 

023^12 

-a 2 if»2i 

021&22 

022^21 

a 22&22 

023&21 

023^22 - 


Stacking Operator 

Define the stacking operator S( ) such that if a j represents the rows of the m x n matrix A, then S(A) is 
an mn x 1 column vector as follows (see eq. (17)): 


a 


T 

2 


S(A) = 


For example, suppose A is 2 x 3. Then, 


S(A) = 


an 

ai2 

a 13 

0 2 1 

022 


20 


L023 J 



G Matrix 

Suppose that Q and R, are both 2x2 and diagonal. Then, the stack of S is given by 



and the vector £ is 



For this case the matrix G is defined by the following: 

"1 000000000000000 
G T _ 0000010000000000 
0000000000100000 
.0 00000000000000 1 
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